Overview:

The TSI S14001A... At first, researching this chip was like trying to batter down a brick wall with your head. There was no
datasheet and (initially) no known patent, but there was demand for emulation (and possibly for an FPGA or PIC
re-implementation to repair arcade and pinball boards). The chip is mentioned in "Talking calculator [for the blind]
incorporates 1-chip µC plus custom microcontroller" --
Source: EDN, v 21, n 11, 5 June 1976, p 35-7

I found a copy of the EDN article at my university library, and the most important piece of information turned out to be in that
article: the name of the person who designed the compression and synthesis scheme used in the chip. That person was Forrest S.
Mozer, a physics professor at Berkeley who also designed the compression scheme used in the Digitalker chip from National
Semiconductor, the compression used on several 'SSI' C64 games such as Ghostbusters, and the compression schemes used by
Sensory Inc. on their chips. (Sensory Inc. was created by Mozer's sons, basically to market his technology!) Once I knew this,
I compared the ROMs from each, and indeed they all do seem to use a very similar format. (I haven't checked Ghostbusters yet,
though.) But MOST importantly, it gave me a patent to look at to learn about the device: US Patent 4,214,125. Also, patents
4,433,434 and 4,435,831 contain additional useful data which helped with figuring out the mirroring method used in the delta
demodulator.

History: 

The TSI S14001A was developed by TeleSensory, Inc. in 1975 as a single-IC speech chip for a portable talking calculator for the
blind, called the Speech+. The speech technology was licensed (I believe with a 3-year exclusive deal) from Forrest S. Mozer, a
professor of atomic physics (speech was a spare-time thing for him) at Berkeley. Forrest Mozer would encode the speech in his
basement laboratory using his novel form of speech encoding (the encoding process apparently involved several minicomputers
running FFTs and a spectrum analyzer), and then General Instrument would make the resulting speech data into a mask ROM to
be used with the TSI chip. In 1978, Dr. Mozer made another exclusive 3-year license, this time with National Semiconductor, who
used his design in their Digitalker, Digitalker II, and Microtalker ICs, the latter two of which apparently never saw the light of
day. Later, in 1983 or 1984, after his sons' first company, SSI, went bankrupt, they started a second company called "Sensory
Circuits Inc.", later shortened to "Sensory Inc.", which used a slightly modified form of the Digitalker compression in their
products, and called it "MX".

The TSI S14001A was used in six products, as far as I'm aware so far (please feel free to point out any more you may
discover):

• [1976] The TeleSensory Inc. "Speech+" talking calculator (which it was originally designed for)

• [1978 or 1979] Atari's unreleased prototype 'Wolf Pack' arcade machine (Thanks to Stefan Jokisch for pointing this one 
out) 

• [1979] The Fidelity Electronics TALKING Chess Challenger (NOT the plain Chess Challenger, the logo being separate on the 
Talking one and not on the normal one) (Thanks to Kevin Horton for pointing this one out) 

• [1980] The Canon "Canola SP1260" adding machine (both US and GERMAN versions are known to exist, with different 
speech ROMs) 

• [1979-1982] The Stern VSU-100 speech board, which was used in six Stern pinball machines: Flight 2000, Catacomb,
Freefall, Lightning, Orbitor-1, and Split Second, with different ROMs in each case. (It may also have been used in the
prototype 'Cue' machine, which never made it to full release.)

• [1980-1981] The Stern VSU-1000 speech board, which was used in the arcade games 'Berzerk' and 'Frenzy', and the 
unreleased prototype of 'Moon War' (but NOT the final version of 'Moon War', which was on Konami 'Scramble' hardware). 

I'm looking for copies of the speech ROMs from the two SP1260 versions and the 'Cue' machine (if it used an S14001A); contact
me if you have one of these! I did learn some fascinating and somewhat useless footnotes to history about the interaction of TSI
and MITalk, Kurzweil, and how TSI eventually split into TSI and Speech Plus, then seemed to re-merge with itself later. Maybe I
can write a doc on the other TSI speech 'chip', the mutilated-MITalk-based Prose 2000 series, which was used in a number of
other TSI 'unlimited text to speech' speech synthesizers. Based on the Canon instruction manual, the Stern schematics, the
silkscreen on the Stern VSU-1000 PCB, and a few patents on devices which probably never saw the light of day, I conclude that
the original datasheet probably called it a 'Custom ROM Controller' or 'CRC chip'. Which is pretty much what it was.
Go figure.

I STILL don't have a copy of the original datasheet, so if anyone finds a copy, please, PLEASE scan it and send it to
me!



Pinouts: 



Based on some very good scans of the Canon adding machine board and the Stern VSU-1000 schematics, I was able to work out
a mostly complete pinout of the chip. The TSI S14001A is a 40-pin DIP IC. It's an ASIC, and has a 4-bit DAC built in. The bottom
of the chip on the VSU-1000 PCB I own has "Philippines" and "S170X" on it.

Here's the pinout, as far as I can tell. It looks rather haphazard compared to most 'modern' chips:

IDx are the word input lines
Ax are the word ROM address lines
Dx are the word ROM data lines
/BUSY is low while a word is being played/said
START is pulled high when a word is to be said and the word number is on the input lines

The Canon 'Canola' uses a separate 'ROM strobe' signal independent of the chip to either enable or clock the speech ROM. It's
likely that they did this to force the speech chip to stop talking, which is normally impossible.
Note from Kevin Horton when testing the hookup of the S14001A: the /BUSY line is not a standard voltage line. When it is in its
HIGH state (i.e. not busy) it puts out a voltage of -10 volts, so it needs to be brought back to a sane voltage level before it
can be passed to any sort of modern IC. The address lines for the speech ROM (A0-A11) do not have this problem; they output at
a TTL/CMOS-compatible voltage.

Operation: 

Put the 6-bit address of the word to be said onto the ID0-ID5 lines. Then clock the START line. As long as the START line is held
high, the first address byte of the first word will be read repeatedly every clock, with the enable line . The signal is just passed 
through the chip. Once START has gone low-high-low, the /BUSY line will go low until 3 clocks after the chip is done speaking. 

For example, let's have the chip play word 03 from the Berzerk ROMs. For two clocks, the chip will read the word address high
nibble from byte 6 (one clock with /ROMEN low, one with it high). Then, for another two clocks, it will read the low address
nibble from byte 7 (again one clock with /ROMEN low, one with it high). For simplicity's sake, assume all reads have this 2-clock
behavior, and hence that all reads take two clocks. The next two clocks, the chip reads the syllable address pointed to by bytes
6 and 7. Since locations 06 and 07 hold the bytes 05 E0, we read the next byte from 0x5E. The S14001A will read a lookup index
location in the ROM (called the 'word memory' in the patent) and then, based on that index data, will read data from an address
based on the data read in the ROM (this is called the 'syllable memory'), and start speaking.

ROM Format: 

03 A0 (epiphany at 16:50 on 12/04/2005) means word data starting at 0x003A

17 1D (epiphany at 17:05 on 12/04/2005) means phoneme? (syllable) data starting at 0x017xx ?
The first area of ROM is the word table. This table holds the addresses where each word's data starts. In the Berzerk speech ROM,
it runs from 0x0000 to 0x0039, but theoretically it can run from 0x0000 to 0x007F if all 64 possible words are used. To decode
addresses stored in this table, read them as big-endian 16-bit values, and right-shift them by 4.
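
The word-table decode described above can be sketched in a few lines of Python. This is only an illustration: the function name and the demo buffer are made up here, and the only bytes taken from the text are the 05 E0 pair at locations 06 and 07 for Berzerk word 03.

```python
def word_data_address(rom: bytes, word: int) -> int:
    """Return the syllable-table address stored for a 6-bit word number."""
    if not 0 <= word < 64:
        raise ValueError("word number must fit in 6 bits")
    offset = 2 * word                                   # two table bytes per word
    raw = int.from_bytes(rom[offset:offset + 2], "big") # big-endian 16-bit value
    return raw >> 4                                     # right-shift by 4


# Bytes 6 and 7 (05 E0) come from the text; every other byte in this
# buffer is a zero placeholder, NOT real Berzerk ROM contents.
demo_rom = bytes([0, 0, 0, 0, 0, 0, 0x05, 0xE0])
print(hex(word_data_address(demo_rom, 3)))
```

With a real ROM dump you would pass the whole image as `rom` and the word number placed on the input lines as `word`.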

The second area of ROM is the syllable table. This table holds the addresses and parameters of the delta-modulation-encoded PCM
data to be played back. In the Berzerk ROM, it runs from 0x003A to 0x016B, with four FFs after that to pad the ROM to exactly
0x170. To decode the data stored in this table, read it as pairs of bytes.

The first byte of data is the high 2 nibbles of the address, i.e. 17 xx means the data starts at 0x170 
The second byte, or parameter byte, is formatted as such: 

76543210 
GBYSSSRR 

• G is set if this is the last syllable of a word.

• B is CLEAR if the syllable is played through straight only once instead of mirroring after the block end.

(B being clear disables all repeats of phones, BUT then R acts as an additional multiplier for the total number of phones
played.)

If B is clear, double the number of total syllables (i.e. 4x the number of audible ones, since all are now audible unless the
silence bit is set) are played in the same time period as otherwise.

• Y is set if the syllable is silent. The syllable is read and probably rendered normally internal to the chip, but the output
DAC is held at 0x07.

• S is the length of the word in syllables; the number of syllables is 8 - (this_number).

• R is the number of times a syllable repeats; the number of repeats is 8 - (2 * this_number).
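
As a rough illustration of the byte-pair layout above, here is a small Python sketch that splits one syllable-table entry into its fields. The names (`SyllableEntry`, `decode_syllable_entry`, `mirror`, `length`, `repeats`) are invented for this example; the bit positions and the two "8 minus" formulas are taken straight from the description above and are not verified against hardware.

```python
from typing import NamedTuple


class SyllableEntry(NamedTuple):
    data_address: int  # address byte supplies the high two nibbles (17 -> 0x170)
    last: bool         # G: last syllable of the word
    mirror: bool       # B: clear means played straight through, no mirroring
    silent: bool       # Y: DAC held at 0x07 while the syllable runs
    length: int        # 8 - S, as described for the S field
    repeats: int       # 8 - 2 * R, as described for the R field


def decode_syllable_entry(addr_byte: int, param: int) -> SyllableEntry:
    """Split an (address byte, parameter byte) pair, layout GBYSSSRR."""
    return SyllableEntry(
        data_address=addr_byte << 4,
        last=bool(param & 0x80),
        mirror=bool(param & 0x40),
        silent=bool(param & 0x20),
        length=8 - ((param >> 2) & 0x07),
        repeats=8 - 2 * (param & 0x03),
    )


print(decode_syllable_entry(0x17, 0x1D))
print(decode_syllable_entry(0x20, 0x9F))
```

The two sample pairs (17 1D and 20 9F) are the first and last entries listed for the Berzerk word "help".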



B, Y, S and R can also be described by a table: 
And all these numbers +8 for when they occur as the last syllable, to account for G.

The third and final area of the ROM is the phoneme data. This area holds the delta-modulation-compressed samples of speech.
The encoding format is a bit strange; the patent calls it "floating-zero, two-bit delta modulation".

Decoding the Phoneme Data: 

Here is how the phoneme data is decoded, according to the patent: 

(Initially and after a reset, the old 2-bit data (old_in) is 10b (2), the accumulator output is 7, and the data shift count is 0.
The accumulator is connected to the DAC through an analog switch, and it runs from 0/low to F/high.)

• Step 1: Grab the (direction ? next : previous) data byte. 

• Step 2: Mask the (direction ? high : low) two bits of the data byte and put into a register (cur_in). This is the 2-bit 
encoded delta data. 

• Step 3: This part is a little tricky and is handled by a table in a PROM normally. 

If the direction is 1 (backwards), SWITCH the current and old values before feeding to the table. 

Here's the table. X means we don't care. 

Old 2 bits:   Current 2 bits:   4-bit signed Output:   Value of Output:
MSB  LSB      MSB  LSB          MSB           LSB
 0    X        0    0            1    1    0    1        -3
 0    X        0    1            1    1    1    1        -1
 0    X        1    0            0    0    0    0         0
 0    X        1    1            0    0    0    1         1
 1    X        0    0            1    1    1    1        -1
 1    X        0    1            0    0    0    0         0
 1    X        1    0            0    0    0    1         1
 1    X        1    1            0    0    1    1         3



• Step 4: Take the 4-bit signed output and add it to the signed old final output; the result is the unsigned 4-bit DAC output.
Yes, it's a little weird, but it should work.

• Step 5: Copy cur_in to old_in, shift data byte (direction ? left : right) by two, add one to shift count. 

• Step 6: If the shift count is less than 4, shift the input byte right by two, and go to step 2. 

If we're immediately after the mirror point in a mirrored sample, the last accumulator output is simply repeated and not
recalculated using the delta; the old/new deltas update as usual, though. **This is very important!** See the S14001A.c code at
http://www.netaxs.com/~gevaryah/S14001A.c for an example delta demodulator with proper mirroring.
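
To make the steps above concrete, here is a minimal Python sketch of the forward-only case. It is not the S14001A.c code and makes several assumptions that the text does not settle: the low two bits of each byte are taken first and the byte is shifted right (my reading of Steps 2 and 5 for the non-mirrored direction), the accumulator is clamped to 0..15 rather than wrapped, and mirroring is not handled at all. The name `demodulate_forward` is invented for the example.

```python
# Signed step, keyed by (MSB of the old 2-bit code, current 2-bit code).
# Values are transcribed from the table above; the old LSB is a don't-care.
DELTA = {
    (0, 0b00): -3, (0, 0b01): -1, (0, 0b10): 0, (0, 0b11): 1,
    (1, 0b00): -1, (1, 0b01): 0, (1, 0b10): 1, (1, 0b11): 3,
}


def demodulate_forward(data: bytes, old_in: int = 0b10, acc: int = 7) -> list[int]:
    """Decode delta bytes played forward into 4-bit DAC levels (0..15)."""
    out = []
    for byte in data:
        for _ in range(4):                    # four 2-bit codes per byte
            cur_in = byte & 0b11              # ASSUMPTION: low pair first
            acc += DELTA[(old_in >> 1, cur_in)]
            acc = max(0, min(15, acc))        # ASSUMPTION: clamp, do not wrap
            out.append(acc)
            old_in = cur_in                   # Step 5: current becomes old
            byte >>= 2                        # move on to the next pair
    return out


print(demodulate_forward(b"\xaa"))  # four "10" codes from the reset state
```

The defaults for `old_in` and `acc` are the reset values given above (10b and 7). A real decoder also needs the backwards direction, including the repeated sample right after the mirror point.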

Reading the Data: 

Once we've demodulated our data block, check if we're done our syllable:

• If we're not, zero the shift count and go to step 1. 

• If we are, then on the S14001A we reset the decoder and output a block of silence exactly the same length as the syllable 
was. Then we check if we need to repeat the syllable, and we do so (including the silence) as many times as the parameter 
byte dictates. After we repeat the syllable the required number of times, we check if we're done our word: 

• If we're not, retrieve the next syllable address and syllable parameters, reset the decoder, then start playing that new 
data. 

• If we're done our word completely, i.e. the high parameter bit of the most recently played syllable was set (to indicate
that it is the last one in a word), then output silence.
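
The end-of-word check in the last bullet can be sketched as a small Python generator that walks the syllable table two bytes at a time until it sees the high parameter bit. Only that check is modelled here; repeats, silence blocks and playback are left out. The function name is invented, and the demo bytes are the ten bytes listed for Berzerk word 00 ("help") followed by the first pair of the next word.

```python
def syllables_of_word(rom: bytes, start: int):
    """Yield (address byte, parameter byte) pairs up to the last syllable."""
    pos = start
    while pos + 1 < len(rom):
        addr_byte, param = rom[pos], rom[pos + 1]
        yield addr_byte, param
        if param & 0x80:      # high bit set: last syllable of this word
            return
        pos += 2              # otherwise move on to the next byte pair


# "help" (17 1D 17 1D 1D 49 1D 6E 20 9F) plus 22 1F from the next word,
# placed at offset 0 of a throwaway buffer rather than at its ROM address.
demo = bytes.fromhex("171D171D1D491D6E209F" "221F")
for addr_byte, param in syllables_of_word(demo, 0):
    print(f"{addr_byte:02X} {param:02X}")
```

The walk stops after 20 9F, so the trailing 22 1F pair is never reached.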

Berzerk Stuff: 



Based on the above information, the syllable table words for Berzerk are:

INDEX:

Word number (address in syllable table of word data) "word spelled out" (comments) : Syllable data of word
first syllable byte pair (most similar "diphone code" to sound played)
more syllable byte pairs

/ before a "diphone code" means that diphone is played silent due to the Y bit.

00 (0x03A) "help" : 17 1D 17 1D 1D 49 1D 6E 20 9F
17 1D (IH)
17 1D (IH)
1D 49 (EL)
1D 6E (/EL)
20 9F (P)

01 (0x044) "kill" : 22 1F 22 1F 17 1E 24 41 28 D9
22 1F (K)
22 1F (K)
17 1E (IH)
24 41 (UH)
28 D9 (L)

02 (0x04E) "attack" : 29 59 29 78 2D 1F 2D 1F 2D 7F 2A 49 2A 78 22 9F
29 59 (UH)
29 78 (/UH)
2D 1F (T)
2D 1F (T)
2D 7F (/T)
2A 49 (A)
2A 78 (/A)
22 9F (K)

03 (0x05E) "charge" : 2d 1f 2d 1f 2f 1c 37 41 37 7c 2d 1f 3b 5b 2f 9c
2D 1F (DT)
2D 1F (DT)
2F 1C (CH)
37 41 (AR)
37 7C (/AR)
2D 1F (DT)
3B 5B (J)
2F 9C (CH)

04 (0x06E) "got" : 22 1f 3c 51 3e 51 3e 6e 2d 9f
22 1f (K)
3c 51 (G)
3e 51 (AH)
3e 6e (/AH)
2d 9f (DT)

05 (0x078) "shoot" : 2f 1d 2f 1e 2f 1d 40 50 40 6e 2d 9f
2f 1d (CH)
2f 1e (CH)
2f 1d (CH)
40 50 (OO)
40 6e (/OO)
2d 9f (DT)

06 (0x084) "get" : 22 1f 42 49 42 78 2d 9f
22 1f (K)
42 49 (EH)
42 78 (/EH)
2d 9f (DT)



07 (0x08C) "is" : 45 41 49 1c 49 9c
45 41 (IH)
49 1c (SZ)
49 9c (SZ)

08 (0x092) "alert" : 51 59 52 41 52 78 2d 9f
51 59 (UHL)
52 41 (ER)
52 78 (/ER)
2d 9f (DT)

09 (0x09A) "detected" : 2d 1f 56 52 56 79 2d 1f 58 51 58 78 2d 1f 5a 51 5a 78 2d 9f
2d 1f (DT)
56 52 (EE)
56 79 (/EE)
2d 1f (DT)
58 51 (EH)
58 78 (/EH)
2d 1f (DT)
5a 51 (KTIH)
5a 78 (/KTIH)
2d 9f (DT)

0A (0x0AE) "the" : 5c 1e 5c 1e 60 c1
5c 1E (THV)
5c 1E (THV)
60 C1 (UEH <schwa>)

0B (0x0B4) "in" : 64 41 68 d9
64 41 (IH)
68 D9 (NN)

0C (0x0B8) "it" : 69 50 69 6e 2d 9f
69 50 (IH)
69 6E (/IH)
2d 9f (DT)

0D (0x0BE) "their/there" : 5c 1e 5c 1e 6b 5a 6c c1
5c 1E (THV)
5c 1E (THV)
6b 5A (EI)
6c C1 (R)

0E (0x0C6) "where" : 70 41 74 d1
70 41 (WHE)
74 D1 (ER)

0F (0x0CA) "humanoid" : 17 1d 17 1d 76 41 7a 41 7e 59 7e 7c 2d 9f
17 1D (IH)
17 1D (IH)
76 41 (YUU)
7A 41 (MAHN)
7E 59 (OY)
7E 7C (/OY)
2D 9F (DT)

10 (0x0D8) "coins" : 22 1f 22 1f 22 7e 7e 41 83 58 49 1d 49 9d
22 1F (K)
22 1F (K)
22 7E (/K)
7E 41 (OY)
83 58 (N)
49 1D (SZ)
49 9D (SZ)

11 (0x0E6) "pocket" : 20 1f 17 1e 84 51 84 78 22 1f 22 7f 86 58 86 78 2d 9f
20 1F (P)
17 1E (IH)
84 51 (IAH)
84 78 (/IAH)
22 1F (K)
22 7F (/K)
86 58 (ID)
86 78 (/ID)
2d 9F (DT)

12 (0x0F8) "intruder" : 87 49 87 7c 2d 1f 2d 1f 2d 7e 8f 5b 8a 5a 8b c1
87 49 (IN)
87 7C (/IN)
2D 1F (DT)
2D 1F (DT)
2D 7E (/DT)
8F 5B (R)
8A 5A (OO)
8B C1 (DER)

13 (0x108) "no" : 96 51 98 d0
96 51 (N)
98 D0 (OWE)

14 (0x10C) "escape" : 9A 58 49 1D 49 1D 49 7C 22 1F 17 1E 9B 49 9B 79 20 9F
9A 58 (EH)
49 1D (SZ)
49 1D (SZ)
49 7C (/SZ)
22 1F (K)
17 1E (IH)
9B 49 (AY)
9B 79 (/AY)
20 9F (P)



15 (0x11E) "destroy" : 2D 1F 9E 52 49 1D 49 1D 49 79 A0 41 A4 D9
2D 1F (DT)
9E 52 (EE)
49 1D (SZ)
49 1D (SZ)
49 79 (/SZ)
A0 41 (TR)
A4 D9 (OY)

16 (0x12C) "must" : A5 41 49 1D 49 1D 49 78 2D 9F
A5 41 (MUH)
49 1D (SZ)
49 1D (SZ)
49 78 (/SZ)
2D 9F (DT)

17 (0x136) "not" : A9 49 A9 78 2D 9F
A9 49 (NOH)
A9 78 (/NOH)
2D 9F (DT)

18 (0x13C) "chicken" : 2D 1F 2D 1F 2F 1C AC 58 AC 79 22 1F 22 7F AD C9
2D 1F (DT)
2D 1F (DT)
2F 1C (CH)
AC 58 (IH)
AC 79 (/IH)
22 1F (K)
22 7F (/K)
AD C9 (N)

19 (0x14C) "fight" : B0 1D B0 1D B6 49 B6 78 2D 9F
B0 1D (F)
B0 1D (F)
B6 49 (IY)
B6 78 (/IY)
2D 9F (DT)

1A (0x156) "like" : B9 41 B9 78 22 1F 17 9F
B9 41 (LIY)
B9 78 (/LIY)
22 1F (K)
17 9F (IH)

1B (0x15E) "a" : BD C1
BD C1 (A)

1C (0x160) "robot" : 8F 41 8F 7C 20 1F 93 49 93 78 2D 9F
8F 41 (RO)
8F 7C (/RO)
20 1F (P)
93 49 (OH)
93 78 (/OH)
2D 9F (DT)



Encoded Phonemes: 



The Berzerk encoded phonemes are: 



0x170 = IH (used as a schwa+fricative for words like 'hustle' and 'humanoid')
0x1Dn = EL
0x20n = P (plosive)
0x22n = K
0x24n = UH ('front')
0x28n = L
0x29n = UH ('bucket')
0x2An = A ('rat' 'pack' 'sat')
0x2Dn = DT (plosive, not mirrored)
0x2Fn = CH (fricative, also used as SCH, not mirrored?)
0x37n = AR
0x3Bn = J
0x3Cn = G
0x3En = AH ('pocket')
0x40n = OO
0x42n = EH ('bet', descending?)
0x45n = IH
0x49n = SZ (voiced S sound like in 'zebra')
0x51n = UHL
0x52n = ER
0x56n = EE
0x58n = EH ('regret', ascending?)
0x5An = KTIH
0x5Cn = THV (voiced TH as in 'thee')
0x60n = UEH (schwa)
0x64n = IH
0x68n = N
0x69n = IH
0x6Bn = EI
0x6Cn = R
0x70n = WHE
0x74n = ER
0x76n = YUU
0x7An = MUHN
0x7En = OY
0x83n = N
0x84n = IAH
0x86n = ID (almost like it ends with an N)
0x87n = IN
0x8An = OO
0x8Bn = DER
0x8Fn = RO
0x93n = OH (AWH, like in 'rock')
0x96n = N
0x98n = OWE
0x9An = EH
0x9Bn = AY
0x9En = EE
0xA0n = TR
0xA4n = OY
0xA5n = MUH
0xA9n = NOH
0xACn = IH
0xADn = N
0xB0n = F (fricative)
0xB6n = IY
0xB9n = LIY
0xBDn = A (long A)



